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Rare events are events that are expected to occur infrequently, or more technically, those that 
have low probabilities (say, order of 10“^ or less) of occurring according to a probability model. 
In the context of uncertainty quantihcation, the rare events often correspond to failure of systems 
designed for high reliability, meaning that the system performance fails to meet some design or 
operation specihcations. As reviewed in this section, computation of such rare-event probabilities 
is challenging. Analytical solutions are usually not available for non-trivial problems and standard 
Monte Carlo simulation is computationally inefficient. Therefore, much research effort has focused 
on developing advanced stochastic simulation methods that are more efficient. In this section, we 
address the problem of estimating rare-event probabilities by Monte Carlo simulation. Importance 
Sampling and Subset Simulation for highly reliable dynamic systems. 

Keywords: Subset Simulation, Monte Carlo simulation, Markov chain Monte Carlo, rare events, failure 
probability, engineering reliability. 


I. INTRODUCTION 

We focus on rare-event simulation for addressing relia¬ 
bility problems corresponding to dynamic systems. To 
compute the rare event (failure) probability for a dy¬ 
namic system, both input (excitation) and modeling un¬ 
certainties should be quantified and propagated. There¬ 
fore, a probability model must be chosen to describe the 
uncertainty in the future input for the system and then a 
chosen deterministic or stochastic system model is used, 
preferably in conjunction with a probability model de¬ 
scribing the associated modeling uncertainties, to propa¬ 
gate these uncertainties. These input and system models 
define a probabilistic description of the system output 
(response). For example, the problem of interest might 
be to compute the small failure probability for a highly 
reliable dynamic system such as a bridge or building un¬ 
der uncertain future earthquake excitation, or for an air¬ 
craft under uncertain excitation by turbulence, using a 
finite-element structural model to approximate the dy¬ 
namics of the system. This model will usually be subject 
to both parametric uncertainty (what values of the model 
parameters best represent the behavior of the system?) 
and non-parametric modeling uncertainty (what are the 
effects of the aspects of the system behavior not captured 
by the dynamic model?). The treatment of input uncer¬ 
tainty has a long history in dynamic reliability theory and 
random vibrations, now more commonly called stochas¬ 
tic dynamics, but the treatment of modeling uncertainty 
is more recent. 

Usually the dynamic model of the system is repre¬ 
sented by a time-dependent BVP (boundary-value prob- 
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lem) involving PDEs (partial differential equations) or by 
a set of coupled ODEs (ordinary differential equations). 
Typically the failure event is defined as any one of a set 
of performance quantities of interest exceeding its spec¬ 
ified threshold over some time interval. This is the so- 
called first-passage problem. This challenging problem 
is characterized by a lack of analytical solutions, even 
for the simplest case of a single-degree-of-freedom lin¬ 
ear oscillator subject to excitation that is modeled as a 
Gaussian process. Approximate analytical methods ex¬ 
ist that are usually limited in scope and their accuracy 
is difficult to assess in a given application [43l [H] . Semi- 
analytical methods from structural reliability theory such 
as FORM and SORM (first- and second-order reliability 
methods) [20l |43] cannot be applied directly to the first- 
passage problem and are inapplicable, anyway, because 
of the high-dimensional nature of the discrete-time in¬ 
put history [35J . Standard Monte Carlo simulation 

has general applicability but it is computationally very 
inefficient because of the low failure probabilities. As a 
consequence, advanced stochastic simulation schemes are 
needed. 


A. Mathematical formulation of problem 

We assume that initially there is a continuous-time de¬ 
terministic model of the real dynamic system that con¬ 
sists of a state-space model with a finite-dimensional 
state X{t) G M" at time t and this is converted to a 
discrete-time state-space model using a numerical time¬ 
stepping method to give: 

A(t)eK", C/(f)GlR’", t = 0,...,T, ^ 

where U{t) G K™ is the input at discrete time t. 
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If the original model consists of a BVP with PDEs de¬ 
scribing a response u{x, t) where x G then we assume 
that a finite set of basis functions {(/>i(x),..., is 

chosen (e.g. global bases such as Fourier and Hermite 
polynomials or localized ones such as finite-element in¬ 
terpolation functions) so that the solution is well approx¬ 
imated by: 

n 

u{x, t) « ^ Xi{t)(l>i{x). (2) 

i=l 

Then a numerical method is applied to the BVP PDEs 
to establish time-dependent equations for the vector of 
coefficients X{t) = [Xi{t),..., Xn{t)] so that the stan¬ 
dard state-space equation in still applies. For ex¬ 
ample, for a finite-element model of a structural system, 
{4>i{x )^..., 4)n{x)} would be local interpolation functions 
over the elements. Then, expressing the BVP in weak 
form, a weighted residual or Galerkin method could be 
applied to give a state-space equation for the vector of 
coefficients X{t) |27) . 

Suppose that a positive scalar performance function 
g{X{t)) is a quantity of interest and that the rare event 
£ of concern is that g{X{t)) exceeds a threshold b over 
some discrete-time interval t = 0,..., T: 


be readily transformed so that the components are in¬ 
dependent and the PDF is a product of one-dimensional 
Gaussian PDFs. In many applications, the discrete-time 
stochastic input history is modeled by running discrete¬ 
time Gaussian white noise through a digital filter to 
shape its spectrum in the frequency domain, and then 
multiplying the filtered sequence by an envelope function 
to shape it in the time domain, if it is non-stationary. 

The model in Q may also depend on uncertain param¬ 
eters 9 G & C MP which includes the initial values V(0) 
if they are uncertain. Then a prior PDF p{9) may be 
chosen to quantify the uncertainty in the value of vector 
6. Some of the parameters may characterize the PDF for 
input U which can then be denoted p{u\9). It is conve¬ 
nient to re-define vector U to also include 9, then the new 
PDF p{u) is p{u\9)p{9) in terms of the previous PDFs. 
We assume that model parameter uncertainty is incor¬ 
porated in this way, so the basic equations remain the 
same as 0, ^ , and Q . When model uncertainty is in¬ 
corporated, the calculated p£ has been referred to as the 
robust rare-event probability unmo], meaning robust to 
model uncertainty, as in robust control theory. 


II. STANDARD MONTE CARLO SIMULATION 


£ = |c7 = (C/(0),..., U{T)) :^_max^g(V(t)) > b 
£ c 

where X{t) satisfies Q. The performance function 
g{X{t)) may involve exceedance of multiple performance 
quantities of interest {gk{X{t)) : k = above 

their corresponding thresholds {ofe}. This can be accom¬ 
plished by aggregating them using the max and min op¬ 
erators in an appropriate combination on the set of gkS; 
for example, for a pure series failure criterion, where 
the threshold exceedance of any ak represents failure, 
one takes the aggregate performance failure criterion as 
g{X{t)) = max{gk{X{t))/ak : k = 1,..., K} > 1, while 
for a pure parallel failure criterion, where all of the gk 
must exceed their thresholds before failure is considered 
to have occurred, one takes the aggregate performance 
failure criterion as g{X(t)) = min{gfe(V(t))/afc : k = 
h...,K}>l. 

If the uncertainty in the input time history vector 
U = [17(0),...,C/(r)] G {D = m X {T + 1)) is 
quantified by a probability distribution for U that has a 
PDF (probability density function) p{u) with respect to 
Lebesgue integration over K^, then the rare-event prob¬ 
ability is given by: 

P£ = F{U G £) = j p{u)du. (4) 

The PDF p{u) is assumed to be readily sampled. Al¬ 
though direct sampling from a high-dimensional PDF 
is not possible in most cases, multi-dimensional Gaus- 
sians are an exception because the Gaussian vector can 



The standard Monte Carlo Simulation method (MGS) 
is one of the most robust and straightforward ways to 
simulate rare events and estimate their probabilities. The 
method was originally developed in m for solving prob¬ 
lems in mathematical physics. Since then MGS has been 
used in many applications in physics, statistics, computer 
science, and engineering, and currently it lays at the 
heart of all random sampling-based techniques [35l |33] . 

The basic idea behind MGS is to observe that the prob¬ 
ability in Q can be written as an expectation: 

Pe = [ l£iu)p{u)du = Ep[l£], (5) 

where l£ is the indicator function of £, that is l£{u) = I 
ii u G £ and l£{u) = 0 otherwise, and D = m x {T + 1) 
is the dimension of the integral. Recall that the strong 
law of large numbers [JS] states that if C/i,..., C/at are in¬ 
dependent and identically distributed (i.i.d.) samples of 
vector U drawn from the distribution p{u), then for any 
function h{u) with finite mean Ep[h(u)], the sample aver- 
age ^ h{Ui) converges to the true value Ep[/i(u)] as 
A —> 00 almost surely (i.e. with probability 1). There¬ 
fore, setting h{u) = l£{u), the probability in ([^ can be 
estimated as follows: 


PS 


r,MCS 

Ps 




( 6 ) 


It is straightforward to show that p^^^ is an unbiased 
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estimator of ps with mean and variance: 

N 


^p[Ps^^] = Ep 




2=1 


N 


— Ep [Ig ] — Pf, 


i=l 


Varp[pf = Varp 


N 


N 


Y.^sm 


1 N 


P£(l -Pg) 

N 


( 7 ) 


Furthermore, by the Central Limit Theorem [45] , as N —>■ 
oo, is distributed asymptotically as Gaussian with 

this mean and variance. 

Frequentist interpretation of MCS: The frequentist in¬ 
terpretation of MCS focuses on the forward problem, ar¬ 
guing that if N is large so that the variance of is 

relatively small, then the value p^^^ based on ® for a 
specific set of N samples {17i, ..., Un} drawn from p{u) 
should be close to the mean pg of pg^^^. The sample 
mean estimate Pg^^^ is very intuitive and, in fact, simply 
reflects the frequentist definition of probability: p^'^^ is 
the ratio between the number of trials where the event 
£ occurred, Ng = J2f=i h{Ui), and the total number of 
trials N. 

Bayesian interpretation of MCS: The same MCS es¬ 
timate p^^^ has a simple Bayesian interpretation (e.g. 
[56jL which focuses on the inverse problem for the specific 
set of N samples {Ui,... ,Un} drawn from p{u). Follow¬ 
ing the Bayesian approach |26j , the unknown probability 
pg is considered as a stochastic variable whose value in 
[0,1] is uncertain. The Principle of Maximum Entropy 
[25j leads to the uniform prior distribution for pg , p{pg ) = 
1) 0 < p£ < 1, which implies that all values are taken 
as equally plausible a priori. Since samples Ui,... ,Un 
are i.i.d, the binary sequence Ig{Ui),... ,Ig{UN) is a se¬ 
quence of Bernoulli trials, and so for the forward prob¬ 
lem, Ng is distributed according to the binomial distri¬ 
bution with parameters N and pg, Ng ~ Bin(A^,p^). 
Therefore, for the set of N samples, the likelihood 

function is p{Ng\pg,N) = - Pe)^~^^ ■ Us¬ 

ing Bayes’ Theorem, the posterior distribution for pg, 
p(p£|iV£,A^) oc p{pg)p{Ng\pg,N), is therefore the beta 
distribution Beta(A^£: -I- 1, iV — Ng + 1), i.e. 


p{pg\Ng,N) 


B{Ng + 1,N-Ng + 1)' 


( 8 ) 


where the beta function B is the normalizing constant 
that equals {N + iy./{Ng\{N - Ng)\) here. The MCS 
estimate is the maximum a posteriori (MAP) estimate, 
which is the mode of the posterior distribution Q and 
therefore the most probable value of pg a posteriori: 


^MCS _ 

"Iv- 


( 9 ) 


Notice that the posterior PDF in Q gives a complete 
description of the uncertainty in the value of pg based on 
the specific set of N samples of U drawn from p{u). The 
posterior distribution in Q is in fact the original Bayes’ 
result 0, although Bayes’ Theorem was developed in full 
generality by Laplace [34] . 

The standard MCS method for estimating the prob¬ 
ability in Q is summarized in the following pseudo-code. 


_Monte Carlo Simulation_ 

Input: 

► N, total number of samples. 

Algorithm: 

Set Ng = 0, number of trials where the event £ 
occurred. 

for i = 1,..., N do 

Sample the input excitation 
U, = {UyO),...,UyT))^p{u). 

Compute the system trajectory 

X, = (x,(o),...,x,(r)) 

using the system model 0 with U{t) = Ui{t). 
if max q(Xi(t)) > b 

t=0,...,T 

Ng -(r- Ng + 1 

end if 
end for 
Output: 

► pMCS ^ Mcg estimate of p£ 

► pipg\Ng,N) = posterior PDF 

of Pf 


Assessment of accuracy of MCS estimate: For the 
frequentist interpretation, the coefficient of variation 
(c.o.v.) for the estimator p¥^^ given by 0, conditional 
on pg and N, is given by 0: 


5{pf^^\pg,N) 


yjNarp[pf^¥ 

Ep[pf^^] 


1 - pg 
Npg 


( 10 ) 


This can be approximated by replacing pg by the es¬ 
timate p¥'^^ ~ Ng/N for a given set of N samples 


6{pf^^\pg,N) 


-JfWcs- 


^ Tmcs 

— Off 


( 11 ) 


For the Bayesian interpretation, the posterior c.o.v. 
for the stochastic variable pg, conditional on the set of 
N samples, follows from 0: 


5{Pe\Ng,N) 


\/YaT[pg\Ng,N] 

E[p£|7V£,iV] 



l_^MCS 

NpMCS 


_ 7mcs 

— I 


(12) 
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as TV —)■ cxD. Therefore, the same expression can be 

used to assess the accuracy of the MCS estimate, even 
though the two c.o.v.s have distinct interpretations. 


The approximation 5^'^^ for the two c.o.v.s reveals 
both the main advantage of the standard MCS method 
and its main drawback. The main strength of MCS, 
which makes it very robust, is that its accuracy does 
not depend on the geometry of the domain £ C and 
its dimension D. As long as an algorithm for generat¬ 
ing i.i.d. samples from p{u) is available, MCS, unlike 
many other methods (e.g. numerical integration), does 
not suffer from the “curse of dimensionality.” Moreover, 
an irregular, or even fractal-like, shape of £ will not affect 
the accuracy of MCS. 

On the other hand, the serious drawback of MCS is 
that this method is not computationally efficient in esti¬ 
mating the small probabilities p£ corresponding to rare 
events, where from (10), 


(13) 


Therefore, to achieve a prescribed level of accuracy ^ < 1, 
the required total number of samples is A' = (pf <5^)“^ 

1. For each sampled excitation Ui, a system analysis — 
usually computationally very intensive — is required to 
compute the corresponding system trajectory Xi and to 
check whether Ui belongs to £. This makes MCS exces¬ 
sively costly and inapplicable for generating rare events 
and estimating their small probabilities. Nevertheless, 
essentially all sampling-based methods for estimation of 
rare event probability are either based on MCS (e.g. Im¬ 
portance Sampling) or have it as a part of the algorithm 
(e.g. Subset Simulation). 


III. IMPORTANCE SAMPLING 

The Importance Sampling (IS) method belongs to the 
class of variance reduction techniques that aim to in¬ 
crease the accuracy of the estimates by constructing 
(sometimes biased) estimators with a smaller variance 
[TJ [22] . It seems it was first proposed in [22] , soon after 
the standard MCS method appeared. 

The inefficiency of MCS for rare event estimation stems 
from the fact that most of the generated samples Ui ~ 
p{u) do not belong to £ so that the vast majority of the 
terms in the sum (|^ are zero and only very few (if any) 
are equal to one. The basic idea of IS is to make use 
of the information available about the rare event £ to 
generate samples that lie more frequently in £ or in the 
important region £ <Z £ that accounts for most of the 
probability content in (|^. Rather than estimating p£ as 
an average of many O’s and very few I’s like in p^^^, IS 
seeks to reduce the variance by constructing an estimator 
of the form ^ where N' is an appreciable 

fraction of N and the Wi are small but not zero, ideally 
of the same order as the target probability, Wi ^ pe- 


Specifically, for an appropriate PDF q{u) on the exci¬ 
tation space the integral in ([^ can be re-written as 
follows: 


P£ 



l£(u)p(u)du 



Ie{u)p{u) 

q{u) 


q(u)du = Eg 


hp 

9 


(14) 


The IS estimator is now constructed similarly to (|^ by 
utilizing the law of large numbers: 


PS 


„is 

Ps 


1 ^ ismpm 

N ^ qm 

^^/£(I/.)u;(C/.), 


(15) 


where Ui,... ,Un are i.i.d. samples from q(u), called the 
importance sampling density (ISD), and w{Ui) = is 
the importance weight of sample Ui. 

The IS estimator converges almost surely as N ^ 
oo to p£ by the strong law of large numbers, provided 
that the support of q{u), i.e. the domain in where 
q{u) > 0, contains the support of l£(u)p(u). Intuitively, 
the latter condition guarantees that all points of £ that 
can be generated by sampling from the original PDF 
p{u), can also be generated by sampling from the ISD 
q{u). Note that if q{u) = p{u), then w{Ui) = 1 and IS 
simply reduces to MCS, p^^^ = By choosing the 
ISD q{u) appropriately, IS aims to obtain an estimator 
with a smaller variance. 

The IS estimator is also unbiased with mean and 
variance: 


^g[P£ ] = E? 


N 


-J2l£{U.)w{U.) 


N 


“VS". 


1 


i=l 

N 


hp 


= P£, 


Var,[p|^] = 

2=1 


hp 




hP^ 


Ps 


(16) 


The IS method is summarized in the following pseudo¬ 
code. 


_Importance Sampling;_ 

Input: 

> N, total number of samples. 

> q{u), importance sampling density. 

Algorithm: 

Set j = 0, counter for the number of samples in £. 
for i = 1,..., N do 

Sample the input excitation 
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U, = {U,iO),...,U,{T))^q{u). 

Compute the system trajectory 
X, = {X,{0),...,X,{T)) 

using the system model Q with U{t) = Uiit). 
if max qlXAt)) > b 

t=0,...,T 

J t— j + 1 

Compute the importance weight of the 
sample in S, wj = . 

end if 
end for 

Ng = j, the total number of trials where the event 
£ occurred. 

Output: 

► Pe^ = —estimate of pg. 


The most important task in applying IS for estimating 
small probabilities of rare events is the construction of 
the ISD, since the accuracy of depends critically on 
q{u). If the ISD is “good”, then one can get great im¬ 
provement in efficiency over standard MCS. If, however, 
the ISD is chosen inappropriately so that for instance 
Ng = 0 or the importance weights have a large variation, 
then IS will yield a very poor estimate. Both scenar¬ 
ios are demonstrated below in the Section “Illustrative 
Example”. 

It is straightforward to show that the optimal ISD, 
which minimizes the variance in (16), is simply the orig¬ 


inal PDF p{u) conditional on the domain £: 


qo{u) =p{u\£) = 


Ig(u)p{u) 


PS 


(17) 


Indeed, in this case, all generated sample excitations sat¬ 
isfy Ui S £, so their importance weights w{Ui) = pg, and 
the IS estimate pg^ = pg. Moreover, just one sample 
(^N = 1) generated from qo{u) is enough to find the prob¬ 
ability Pg exactly. Note, however, that this is a purely 
theoretical result since in practice sampling from the con¬ 
ditional distribution p(it|f) is challenging, and, most im¬ 
portantly, it is impossible to compute qo{u): this would 
require the knowledge of pg , which is unknown. Never¬ 
theless, this result indicates that the ISD q{u) should be 
chosen as close to qo{u) as possible. In particular, most of 
the probability mass of q{u) should be concentrated on £. 
Based on these considerations, several ad hoc techniques 
for constructing ISDs have been developed, e.g. variance 
scaling and mean shifting m- 

In the special case of linear dynamics and Gaussian 
excitation, an extremely efficient algorithm for estimat¬ 
ing the rare-event probability pg in 0 , referred to as 
ISEE [Importance Sampling using Elementary Events), 
has been presented [3]. The choice of the ISD exploits 
known information about each elementary event, defined 
as an outcrossing of the performance threshold b in 0 
at a specific time t G {0,... ,T}. The c.o.v. of the ISEE 


estimator for N samples of U from p{u) is given by 


gISEE 

On 


a 

vW’ 


(18) 


where the proportionality constant a is close to 1, re¬ 
gardless of how small the value of pg. In fact, a decreases 
slightly as pg decreases, exhibiting the opposite behavior 
to MCS. 

In general, it is known that in many practical cases of 
rare event estimation it is difficult to construct a good 
ISD that leads to a low-variance IS estimator, especially 
if the dimension of the uncertain excitation space is 
large, as it is in dynamic reliability problems [5]. A geo¬ 
metric explanation as to why IS in often inefficient in high 
dimensions is given in [32] ■ Au |2| has presented an effi¬ 
cient IS method for estimating pg in (|^ for elasto-plastic 
systems subject to Gaussian excitation. In recent years, 
substantial progress has been made by tailoring the se- 
quential importance sampling (SIS) methods [S^, where 
the ISD is iteratively refined, to rare event problems. SIS 
and its modifications have been successfully used for es¬ 
timating rare events in dynamic portfolio credit risk m, 
structural reliability [33], and other areas. 


IV. SUBSET SIMULATION 

The Subset Simulation (SS) method jj] is an advanced 
stochastic simulation method for estimating rare events 
which is based on Markov chain Monte Carlo (MGMG) 
[331131]. The basic idea behind SS is to represent a very 
small probability pg of the rare event £ as a product of 
larger probabilities of “more-frequent” events and then 
estimate these larger probabilities separately. To imple¬ 
ment this idea, let 

R^ = £oD£i...d£l=£ (19) 

be a sequence of nested subsets of the uncertain excita¬ 
tion space starting from the entire space £o = and 
shrinking to the target rare event £g = £. By analogy 
with ([ 3 ]), subsets Si can be defined by relaxing the value 
of the critical threshold b: 

= |c/ G ^ inax^5(A(t)) > , (20) 

where bi < ... < bg = b. In the actual implementa¬ 
tion of SS, the number of subsets L and the values of 
intermediate thresholds {bi} are chosen adaptively. 

Using the notion of conditional probability and exploit¬ 
ing the nesting of the subsets, the target probability pg 
can be factorized as follows: 

L 

pg=l[F{£,\£,_,). (21) 

i=l 

An important observation is that by choosing the in¬ 
termediate thresholds {bi} appropriately, the conditional 
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events {£i\£i-i\ can be made more frequent, and their 
probabilities can be made large enough to be amenable 
to efficient estimation by MCS-like methods. 

The first probability P(£i|£o) = P(^i) can be readily 
estimated by standard MCS: 

1 ” 

(22) 

i=i 


where Ui,... ,Un are i.i.d. samples from p{u). Estimat¬ 
ing the remaining probabilities P(£i|fi_i), z > 2 , is more 
challenging since one needs to generate samples from the 

conditional distribution p(u|£i_i) = which, 

in general, is not a trivial task. Notice that a sample 
U from p{u\£i-i) is one drawn from p{u) that lies in 
£i-i. However, it is not efficient to use MCS for gener¬ 
ating samples from p{u\£i-i): sampling from p{u) and 
accepting only those samples that belong to £i-i is com¬ 
putationally very expensive, especially at higher levels i. 

In standard SS, samples from the conditional distri¬ 
bution p[u\£i-i) are generated by the modified Metropo¬ 
lis algorithm (MMA) |3] which belongs to the family of 
MCMC methods for sampling from complex probability 
distributions that are difficult to sample from directly 
[551 H5] . An alternative strategy — splitting — is de¬ 
scribed in the next section. 

The MMA algorithm is a component-wise version of 
the original Metropolis algorithm [55]. It is specifically 
tailored for sampling from high-dimensional conditional 
distributions and works as follows. First, without loss of 
generality, assume that p{u) = Y[k=i Pk{''^k), i-C. compo¬ 
nents of U are independent. This assumption is indeed 
not a limitation, since in simulation one always starts 
from independent variables to generate correlated exci¬ 
tation histories U. Suppose further, that some vector 
Ui S is already distributed according to the tar¬ 
get conditional distribution, Ui ~ p{u\£i-i). MMA pre¬ 
scribes how to generate another vector U 2 ~ p{u\£i-i) 
and it consists of two steps: 


1. Generate a “candidate” state V as follows: first, 
for each component k = 1,..., D of V, sample I'ik) 
from the symmetric univariate proposal distribution 
qk,i{i^\Ui{k)) centered on the component of Ui, 
where symmetry means that qk,i{k'\u) = qk,iiu\v); 
then, compute the acceptance ratio 
finally, set 


V{k) = 


v{k), with prob. min{l,rfc}, 
Ui{k), with prob. 1 —min{I,rfe}. 


(23) 


2 . Accept or reject the candidate state V: 


The procedure for generating conditional samples 
at level i is as follows. Starting from a “seed” 
Ui ^ p{u\£i-i), one can now use MMA to generate a 
sequence of random vectors Hi,..., t/„, called a Markov 
chain, distributed according to p(u\£i-i). At each step, 
Uj is used to generate the next state H^+i. Note that al¬ 
though these MCMC samples are identically distributed, 
they are clearly not independent: the correlation between 
successive samples is due to the proposal PDFs {qk,i} 
at level i that govern the generation of Hj+i from Uj. 
Nevertheless, Ui,... ,Un can still be used for statistical 
averaging as if they were i.i.d, although with certain re¬ 
duction in efficiency |3] . In particular, similarly to (22 1 , 
the conditional probability R{£i\£i-i), can be estimated 
as follows: 




(25) 


i=i 


To obtain an estimator for the target probability 
P £, it remains to multiply the MCS (22) and MCMC 
(25) estimators of all factors in (21). In real appli¬ 


cations, however, it is often difficult to rationally de¬ 
fine the subsets {£i\ in advance, since it is not clear 
how to specify the values of the intermediate thresh¬ 
olds {hi}. In SS, this is done adaptively. Specifi¬ 
cally, let ,..., Un'^ be the MCS samples from p {^ , 


A 


(0) 


,..., be the corresponding trajectories from Q, 


and = maxt^o, 


T g{X^^\t)) be the resulting per¬ 
formance values. Assume that the sequence is 

ordered in non-increasing order, i.e. > ... > Gn\ 

renumbering the samples where necessary. Define the 
first intermediate threshold bi as follows: 


H = 


^( 0 ) I ^ 

^npo \ Lt, 


(0) 

npa-\-l 


(26) 


where po is a chosen probability satisfying 0 < po < 1- 
This choice of 61 has two immediate consequences: first, 
the MCS estimate of P(fi) in (22) is exactly po, and, 
second, .. ,Unpa not only belong to £ 1 , but also 

are distributed according to the conditional distribution 
p(u\£i). Each of these npo samples can now be used as 
mother seeds in MMA to generate (^ — 1) offspring, giv¬ 
ing a total of n samples U^\ ..., Un'^ ~ p{u\£i). Since 
these seeds start in the stationary state p{u\£{) of the 
Markov chain, this MCMC method gives perfect sam¬ 
pling, i.e. no wasteful burn-in period is needed. Similarly, 
62 is defined as 


( V, ifVe£,.i, 
\ Hi, ifVi£,.i. 


(24) 


&2 


G 


( 1 ) 


+ G 

~Y~ 


(1) 

npo + l 


(27) 


It can be shown that U 2 generated by MMA is indeed 
distributed according to the target conditional distribu¬ 
tion p{u\£i-i) when Hi is Uj. For a detailed discussion 
of MMA, the reader is referred to [55] . 


where are the (ordered) performance values cor¬ 

responding to excitations {Hj^^}. Again by construc¬ 
tion, the estimate (25) gives R[£ 2 \£i) ~ Po, and 
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[/(i) jj 

Ui ,. . ., u, 


( 1 ) 

npo 


p(u\£ 2 )- The SS method proceeds in 
this manner until the target rare event £ is reached and 
is sufficiently sampled. All but the last factor in ( pTj ) 
are approximated by poi and the last factor P(£^|£i_i)S! 
— > Po I where ns is the number of samples in £ among 


U 


(L-l) 


,..., C/, 


(L-l) 


p{u\£l-i)- The method is more 


formally summarized in the following pseudo-code. 


Subset Simulation 


Input: 

> n, number of samples per conditional level. 

> PO) level probability; e.g. po = 0 . 1 . 

1 > {ft.i}) proposal distributions; 

e.g. qk,i{v\u) =N{v\u,al ^). 

Algorithm: 

Set i = 0, number of conditional level. 

Set = 0, number of the MCS samples in £. 


Sample the input excitations ,..., Un'^ 
Compute the corresponding trajectories 


'p(ii). 


X- 


(0) 




(0) 


for j = 1,..., n do 


if G 


(0) 


= max p(X-°Vt)) > b do 


,(o) 


end if 
end for 

jn < pfj do 


1 


while 


i ^ i + 1^ a new subset £i is needed. 
Sort so that 

> ... > G^n~^\ 

Define the intermediate threshold: 




G 


0 - 1 ) 


^npo TI /2 . 


b, = (g^ 

for j = 1,..., npo do 

Using Wj^i = p{u\£i) as a 

seed, use MM A to generate (A — i) 
additional states of a Markov chain 
w;j-,i,...,iUj-,i/po ^p{u\£i). 

end for 

Renumber: 

^ ..., - p{u%). 

Compute the corresponding trajectories 


A 


,X 


i) 


for j = 1,..., n do 


if G 


(i) 


= max (t)) > b do 

0 ) 

■£ 


1 


Ug ^ n 

end if 
end for 
end while 

L = j -I- 1, number of levels, i.e. subsets £i in (19) 


and ( 20 ). 


N = n + n{l—po){L — l), total number of samples. 
Output: 

SS estimate of pp- 


(L-l) 

:^SS _ 1 


► Ps = Po 


Implementation details of SS, in particular the choice 
of level probability po and proposal distributions {qk}, 
are thoroughly discussed in [SS]. It has been confirmed 
that Po = 0.1 proposed in the original paper [ 3 | is a nearly 
optimal value. The choice of {qk,i} is more delicate, since 
the efficiency of MMA strongly depends on the proposal 
PDF variances in a non-trivial way: proposal PDFs with 
both small and large variance tend to increase the cor¬ 
relation between successive samples, making statistical 
averaging in (25) less efficient. In general, finding the op¬ 
timal variance of proposal distributions is a challenging 
task not only for MMA, but also for almost all MCMC al¬ 
gorithms. Nevertheless, it has been found in many appli¬ 
cations that using qk,i(^n\u) = Af{v\u, J, the Gaussian 
distribution with mean u and variance tr^ ^, yields good 
efficiency if ^ = Cg and p{u) is a multi-dimensional 
Gaussian with all variances equal to ctq. For an adaptive 
strategy for choosing {qk,i}, the reader is referred to [5S] : 
for example, cr^ ■ = cr^ can be chosen so that the observed 
average acceptance rate in MMA, based on a subset of 
samples at level f, lies in the interval [0.3,0.5]. 

It can be shown HIT] that, given p£, po, and the total 
number of samples X, the c.o.v. of the SS estimator pf^ 
is given by 


5'^{.Pe^\P£.Po,N) = 


(l+7)(l-Po)„^p_i 


Xpo(lnpo ^ 


)^ (28) 


where 2 < r < 3 and 7 is approximately a constant that 
depends on the state correlation of the Markov chain at 
each level. Numerical experiments show that r = 2 gives 
a good approximation to the c.o.v. and that 7 « 3 if 
the proposal variance of for each level is appropriately 
chosen IHITIIM!. It follows from ( [l^ that ^ Ps^ 

for MCS, while for SS, i5|_g (x (Inp^ Y. This drastically 
different scaling behavior of the c.o.v.’s with small ps 
directly exhibits the improvement in efficiency. 

To compare an advanced stochastic simulation algo¬ 
rithm directly with MCS, which is always applicable (but 
not efficient) for rare event estimation, |llj introduced 
the relative computation efficiency of an algorithm, pAi 
which is defined as the ratio of the number of samples 
Nmcs required by MCS to the number of samples Na 
required by the algorithm for the same c.o.v. 5. The 
relative efficiency of SS is then 


X, 


Pss = 


MCS 


Po(lnPo 


Nss (Y + l){l-po)PEYa.Ps^Y 
0.03p7^ 


(29) 


(logioPf 


for r = 2, 7 = 3, and po = 0.1. For rare events, p^ ^ is 
very large, and, as expected, SS outperforms MCS; for 
example, if ps = 10 “®, then pss ~ 800. 

In recent years, a number of modifications of SS have 
been proposed, including SS with Splitting HZ! (de¬ 
scribed in the next section). Hybrid SS jlH], Two-Stage 















SS |30], Spherical SS [3T], and SS with delayed rejec¬ 
tion m- A Bayesian post-processor for SS, which gen¬ 
eralizes the Bayesian interpretation of MCS described 
above, was developed in [SS] • In the original paper [J] , SS 
was developed for estimating reliability of complex civil 
engineering structures such as tall buildings and bridges 
at risk from earthquakes. It was applied for this pur¬ 
pose in and [53]. SS and its modifications have also 
been successfully applied to rare event simulation in fire 
risk analysis | 8 ], aerospace |39l |52|, nuclear [16], wind 
[15] and geotechnical engineering [15], and other fields. 
A detailed exposition of SS on an introductory level and 
a MATLAB code implementing the above pseudo-code is 
given in [55]. For more advanced and complete reading, 
the fundamental monograph on SS [7] is strongly recom¬ 
mended. 


V. SPLITTING 


In the previously presented stochastic simulation 
methods, samples of the input and output discrete-time 
histories, {U{t) : t = 0, ...,T} C and {X{t) : t = 
0, ...,T} C K", are viewed geometrically as vectors U 
and X that define points in the vector spaces 
and respectively. In the splitting method, how¬ 

ever, samples of the input and output histories are viewed 
as trajectories defining paths of length (T -|- 1) in K"* 
and K", respectively. Samples that reach a certain desig¬ 
nated subset in the input or output spaces at some time 
are treated as “mothers” and are then split into multiple 
offspring trajectories by separate sampling of the input 
histories subsequent to the splitting time. These multi¬ 
ple trajectories can themselves subsequently be treated 
as mothers if they reach another designated subset nested 
inside the first subset at some later time, and so be split 
into multiple offspring trajectories. This is continued un¬ 
til a certain number of the trajectories reach the smallest 
nested subset corresponding to the rare event of interest. 

Splitting methods were originally introduced by Kahn 
and Harris [5S] and they have been extensively studied 
(for example, [I2l[l7l|42l[53]). We describe splitting here 
by using the framework of Subset Simulation where the 
only change is that the conditional sampling in the nested 
subsets is done by splitting the trajectories that reach 
each subset, rather than using them as seeds to gener¬ 
ate more samples from Markov chains in their stationary 
state. As a result, only standard Monte Carlo simulation 
is needed, instead of MCMC simulation. 

The procedure in m is followed here to generate off¬ 
spring trajectories at the level (f = I,..., L) of Sub¬ 
set Simulation from each of the mother trajectories in £i 
constructed from samples from the previous level, except 
that we present it from the viewpoint of trajectories in 
the input space, rather than the output space. Therefore, 
at the level, each of the npo sampled input histories 
Uj, j = 1 ,... ,npo, from the previous level that satisfy 
Uj G £i, as defined in (20) (so the corresponding output 


history Xj satisfies max g{Xj{t)) > bi), are split at 
their first-passage time 


tj = min{t = 0,... ,T : g{Xj{t)) > hi} (30) 


This means that the mother trajectory Uj is partitioned 
as [[/“,[/+] where U~ = ... ,Uj(tj)] and C/+ = 

\Uj{tj-\-l ),..., Uj{T)\] then a subtrajectory sample U'^ = 
\Uj{tj -I- 1),..., Uj{T)\ is drawn from 


, f‘{£i\u. 

p«|C7-,£,)= ' 




nmi) 




(31) 


= p(u\\U. ) =p{u\), 


where the last equation follows if one assumes indepen¬ 
dence of the = 0 ,..., r (although it is not nec¬ 

essary). Also, F{£i\u'^,U~) = 1 = F{£i\U~). Note that 

the new input sample Uj = \U~ , U^ ] also lies in £i since it 
has the subtrajectory U~ in common with Uj, which im¬ 
plies that the corresponding outputs at the first-passage 
time tj are equal: Xj(tj) = Xj(tj) > bi. The offspring 
trajectory Uj is a sample from p{u) lying in £i and so, 
like its mother Uj, it is a sample from p{u\£i). This 
process is repeated to generate (A — l) such offspring 
trajectories from each mother trajectory, giving a total 
of np[){X _ ij -\-npo = n input histories that are samples 
from p{u\£i) at the f**' level. 

The pseudo-code for the splitting version of Sub¬ 
set Simulation is the same as the previously presented 
pseudo-code for the MCMC version except that the part 
describing the generation of conditional samples at level 
i using the MMA algorithm is replaced by: 


Generation of conditional samples 
at level i with Splitting 

for j = 1,..., npo do 

Using ~ p{u\£i) as a mother trajectory, 

generate (A — l) offspring trajectories by splitting 
of this input trajectory. 

end for 


To generate the same number of samples n at a level, 
the splitting version of Subset Simulation is slightly more 
efficient than the MCMC version using MMA because 
when generating the conditional samples, the input off¬ 
spring trajectories U = [U~,U~^] already have available 
the first part X~ of the corresponding output trajectory 
X = [X~,X~^]. Thus, ([^ need only be solved for 
starting from the final value of X~ (which corresponds to 
the first-passage time of the trajectory). A disadvantage 
of the splitting version is that it cannot handle param¬ 
eter uncertainty in the model in 0 since the offspring 
trajectories must use 0 with the same parameter val¬ 
ues as their mothers. Furthermore, the splitting version 
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applies only to dynamic problems, as considered here. 
The MCMC version of Subset Simulation can handle pa¬ 
rameter uncertainty and is applicable to both static and 
dynamic uncertainty quantification problems. 

Ching, Au and Beck m discuss the statistical prop¬ 
erties of the estimators corresponding to (22) and (25) 
when the sampling at each level is done by the trajectory 
splitting method. They show that as long as the condi¬ 
tional probability in Subset Simulation satisfies po > 0.1, 
the coefficient of variation for pg when estimating it by 
(21) and (25) is insensitive to po- 

Ching, Beck and Au m also introduce a hybrid ver¬ 
sion of Subset Simulation that combines some advantages 
of the splitting and MCMC versions when generating the 
conditional samples Uj,j = l,...,n at each level. It 
is limited to dynamic problems because of the splitting 
but it can handle parameter uncertainty through using 
MCMC. All three variants of Subset Simulation are ap¬ 
plied to a series of benchmark reliability problems in [B] ; 
their results imply that for the same computational effort 
in the dynamic benchmark problems, the hybrid version 
gives slightly better accuracy for the rare-event proba¬ 
bility than the MCMC version. For a comparison be¬ 
tween these results and those of other stochastic simula¬ 
tion methods that are applied to some of the same bench¬ 
mark problems (e.g. Spherical Subset Simulation, Aux¬ 
iliary Domain Method and Line Sampling), the reader 
may wish to check [47]. 


VI. ILLUSTRATIVE EXAMPLE 

To illustrate MCS, IS, and SS with MCMC and Split¬ 
ting for rare event estimation, consider the following 
forced Lorenz system of ordinary differential equations: 

Xi=a{X2-Xi) + Uit), (32) 

Aa = rXi - Aa - A 1 A 3 , (33) 

^ = AiAa - 6 A 3 , (34) 

where X{t) = (Ai(t), Aa(t), A 3 (t)) defines the system 
state at time t and U{t) is the external excitation to the 
system. If U{t) = 0, these are the original equations due 
to E. N. Lorenz that he derived from a model of fluid 
convection |36| . In this example, the three parameters 
O’, r, and b are set to cr = 3, 6 = 1, and r = 26. It is 
well-know (e.g. m) that in this case, the Lorenz system 
has three unstable equilibrium points, one of which is 

A* = (v'Kr - 1), V 6 (r - I), r - l) = (5, 5, 25), (35) 

that lies on one “wing” of the “butterfly” attractor. Let 

A(0) = A* -h (1/2,1/2,1/2) = (5.5,5.5,25.5) (36) 

be the initial condition, and X{t) be the correspond¬ 
ing solution. Lorenz showed |3B] that the solution of 





20 40 60 80 100 






FIG. 1: The left column shows the solution of the unexcited Lorenz 
system (a = 0) enclosed in the bounding ellipsoid E (top) and the 
corresponding response function g(t) (bottom), where t G [0, T], 
T = 100. The right top panel shows the solution of the forced 
Lorenz system (o = 3) that corresponds to an excitation U £ €. 
As it is clearly seen, this solution leaves the ellipsoid E. According 
to the response function g{t) shown in the right bottom panel, this 
first-passage event happens around t = 90. 


( 32|33|34' ) with U(t) = 0 always (for any t) stays inside 
the bounding ellipsoid E: 


xijtr 

<7 


X^(ff {X^(t)-R? 
bR^ 


R = r + a (37) 


Suppose that the system is now excited by U{t) = 
aB{t), where B{t) is the standard Brownian process 
(Gaussian white noise) and a is some scaling constant. 
The uncertain stochastic excitation t7(t) makes the cor¬ 
responding system trajectory X{t) also stochastic. Let 
us say that the event £ occurs if A (t) leaves the bounding 
ellipsoid E during the time interval of interest [0,T]. 

The discretization of the excitation U is obtained by 
the standard discretization of the Brownian process: 


U{0) = 0, 

U (fc) = aB{kXt) = U{k — 1) + a'/AtZk 

k ( 38 ) 

= Q;-\/Aty^ Zi, 


where At = 0.1s is the sampling interval, k = 1,..., D = 
T/At, and Zi,..., Zu are i.i.d. standard Gaussian ran¬ 
dom variables. The target domain £ C is then 

£ = {{Zi,... ,Zd) ■■ max g(fc) > 1}, (39) 


where the system response g{k) at time t = kAt is 


9{k) 


Ai(fcAt)2 X2{kAtf {X^{kAt)-Rf 
i?2k + i?2 

(7 


Figure shows the solution of the unforced Lorenz 
system (with a = 0 so C/(t) = 0 ), and an example of the 
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FIG. 2: Top panel shows the estimate of the probability p£ of 
event £ where ck = 3 as a function of duration time T. For each 
value of T G [5,100], N = 10^ samples were used in MCS and 
n = 2 X 10^ samples per conditional level were used in the two 
versions of SS. The MCS and SS/Splitting estimates for p£ are 
zero for T < 15 and T < 12, respectively. The bottom panel shows 
the total computational effort automatically chosen by both SS 
algorithms. 


solution of the forced system (with a = 3) that corre¬ 
sponds to excitation U G £ (slightly abusing notation, 
U = U{Zi, ..., Zd) G £ means that the corresponding 
Gaussian vector {Zi, ..., Zo) G £). 

Monte Carlo Simulation: For 0 = 3, Figure shows 
the probability ps of event £ as a function of T estimated 
using standard MCS: 


N 


Ps 


■MCS 




(41) 


Z=1 


where = {z[^\ ..., Z'j^^) ^ 4>{z) are i.i.d. samples 
from the standard H-dimensional Gaussian PDF 4){z). 
For each value oi T, N = lO"* samples were used. When 
T < 25 the accuracy of the MCS estimate (41) begins 




to degenerate since the total number of samples N be¬ 
comes too small for the corresponding target probability. 
Moreover, for T < 15, none of the N generated MCS 
samples belong to the target domain £, making the MCS 
estimate zero. Figure]^ shows, as expected, that p£ is an 
increasing function of T, since the more time the system 
has, the more likely its trajectory eventually penetrates 
the boundary of ellipsoid E. 

Importance Sampling: IS is a variance reduction tech¬ 
nique and, as it was discussed in previous sections, its 
efficiency critically depends on the choice of the ISD q. 
Usually some geometric information about the target do¬ 
main £ is needed for constructing a good ISD. To get 
some intuition, Figurej^shows the domain £ for two lower 
dimensional cases: T = 1, At = 0.5 {D = 2) and T = 1.5, 
At = 0.5 {D — 3). Notice that in both cases, £ consists 
of two well separated subsets, £ = £_ U£+, which are ap¬ 
proximately symmetric about the origin. This suggests 
that a good ISD must be a mixture of two distributions 



FIG. 3: Left panel: visualization of the domain E in two dimen¬ 
sional case D = 2, where T = 1, At = 0.5, and o = 20. N = 10"^ 
samples were generated and marked by red circles (respectively, 
green dots) if they do (respectively, do not) belong to £. Right 
panel: the same as on the left panel but with D = 3 and T = 1.5. 


q- and < 7 +, that effectively sample £- and £+, 

= id£)±i±M (42) 

In this example, three different ISDs, denoted gi, 92 ) ftnd 
53 , are considered: 

Case 1: q±{z) = (/)(z| ± zg), where Z£ ^ (f)[z\£). That 
is, we first generate a sample Z£ G £ and then take 
ISD 51 as the mixture of Gaussian PDFs centered 
at Z£ and —Z£. 

Case 2: q±{z) = (j){z\ ± z^), where z^ is obtained as 
follows. First we generate n = 1000 samples from 
4>{z), and define z^ to be the sample in £ with the 
smallest norm. Sample z’^ can be interpreted as the 
“best representative” of £- (or f+), since (j){z£) has 
the largest (among generated samples) value. We 
then take ISD 52 as the mixture of Gaussian PDFs 
centered at and —z’^. 

Case 3: To illustrate what happens if one ignores the 
geometric information about two components of £, 
we choose 53 ( 2 :) = (j){z\z£), as given in Case 2. 

Let T = 1 and a = 20. The dimension of the uncer¬ 
tain excitation space is then D = 10. Table |l] shows the 
simulation results for the above three cases as well as for 
standard MCS. The IS method with 51 , on average, cor¬ 
rectly estimates p£. However the c.o.v. of the estimate is 
very large, which results in large fluctuations of the esti¬ 
mate in independent runs. IS with 52 works very well and 
outperforms MCS: the c.o.v. is reduced by half. Finally, 
IS with 53 completely misses one component part of the 
target domain £, and the resulting estimate is about half 
of the correct value. Note that the c.o.v. in this case is 
very small, which is very misleading. 

It was mentioned in previous sections that IS is often 
not efficient in high dimensions because it becomes more 
difficult to construct a good ISD [5l[32]. To illustrate this 
effect, IS with 52 was used to estimate p£ for a sequence of 
problems where the total duration time gradually grows 
from T = 1 to r = 10. This results in an increase of 
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TABLE I: Simulation results for IS and MCS. For each 
method, mean values {ps) of the estimates and their coef¬ 
ficient of variations 5{p£) are based on 100 independent runs. 

{pe) S{p£) 

MCS 3.4 X 10“® 17% 

IS qi 3.2 X 10“® 132.4% 

IS q2 3.4 X 10“® 8.3% 

IS qs 1.8 X 10“® 5.5% 



FIG. 4: Estimation of the target probability p£ as a function of 
duration time T. Solid red and dashed blue curves correspond to 
MCS and IS with ^ 2 , respectively. In this example, a = 20 and 
N = 10"^ samples for each value of T is used. It is clearly visible 
how the IS estimate degenerates as the dimension D goes from 10 
(T = 1) to 100 (T = 10). 


the dimension D of the underlying uncertain excitation 
space from 10 to 100. Figurej^shows how the IS estimate 
degenerates as the dimension D of the problem increases. 
While IS is accurate when D = 10 {T = 1), it strongly 
underestimates the true value of ps as D approaches 100 
(T = 10). 

Subset Simulation: SS is a more advanced simulation 
method and, unlike IS, it does not suffer from the curse 
of dimensionality. For a = 3, Figure shows the es¬ 
timate of the target probability p£ as a function of T 
using SS with MCMC and Splitting. For each value of 
T, n = 2 X 10^ samples were used in each conditional level 
in SS. Unlike MCS, SS is capable of efficiently simulating 
very rare events and estimating their small probabilities. 
The total computational effort, i.e. the total number N 
of samples automatically chosen by SS, is shown in the 
bottom panel of Figure Note that the larger the value 
of p£, the smaller the number of conditional levels in SS, 
and, therefore, the smaller the total number of samples 
TV. The total computational effort in SS is thus a decreas¬ 
ing function of T. In this example, the original MCMC 
strategy [J] for generating conditional samples outper¬ 
forms the splitting strategy m that exploits the causal¬ 
ity of the system: while the SS/MCMC method works 
even in the most extreme case (T = 5), the SS/Splitting 
estimate for pg becomes zero for T < 12. 


VII. CONCLUSION 

This chapter examines computational methods for 
rare-event simulation in the context of uncertainty quan¬ 
tification for dynamic systems that are subject to fu¬ 
ture uncertain excitation modeled as a stochastic pro¬ 
cess. The rare events are assumed to correspond to some 
time-varying performance quantity exceeding a specified 
threshold over a specified time duration, which usually 
means that the system performance fails to meet some 
design or operation specifications. 

To analyze the reliability of the system against this 
performance failure, a computational model for the 
input-output behavior of the system is used to predict the 
performance of interest as a function of the input stochas¬ 
tic process discretized in time. This dynamic model 
may involve explicit treatment of parametric and non- 
parametric uncertainties that arise because the model 
only approximately describes the real systems behavior, 
implying that there are usually no true values of the 
model parameters and the accuracy of its predictions are 
uncertain. In the engineering literature, the mathemat¬ 
ical problem to be solved numerically for the probabil¬ 
ity of performance failure, commonly called the failure 
probability, is referred to as the first-passage reliability 
problem. It does not have an analytical solution and 
numerical solutions must face two challenging aspects: 

1. The vector representing the time-discretized 
stochastic process that models the future system 
excitation lies in an input space of high dimension; 

2. The dynamic systems of interest are assume to be 
highly reliable so that their performance failure is a 
rare event, that is, the probability of its occurrence, 
P£, is very small. 

As a result, standard Monte Carlo Simulation and Im¬ 
portance Sampling methods are not computationally ef¬ 
ficient for first-passage reliability problems. On the other 
hand. Subset Simulation has proved to be a general and 
powerful method for numerical solution of these prob¬ 
lems. Like MCS, it is not affected by the dimension of 
the input space and for a single run, it produces a plot of 
P£ vs threshold b covering pg S [pq 1], where L is the 
number of levels used. For a critical appraisal of methods 
for first-passage reliability problems in high dimensions, 
the reader may wish to check Schueller et al [35] . 

Several variants of Subset Simulation have been de¬ 
veloped motivated by the goal of further improving the 
computational efficiency of the original version, although 
the efficiency gains, if any, are modest. All of them have 
an accuracy described by a coefficient of variation for the 
estimate of the rare-e vent p robability that depends on 
ln(l/p£) rather than y/ljpg as in standard Monte Carlo 
simulation. For all methods covered in this section, the 
dependence of this coefficient of variation on the num¬ 
ber of samples N is proportional to Therefore, 

in the case of very low probabilities, pg, it still requires 
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thousands of simulations (large N) of the response time 
history based on a dynamic model as in Q in order to 
get acceptable accuracy. For complex models, this com¬ 
putational effort may be prohibitive. 

One approach to reduce the computational effort when 
estimating very low rare-event probabilities is to utilize 
additional information about the nature of the problem 
for specific classes of reliability problems (e.g. mm)- 
Another more general approach is to construct surrogate 
models (meta-models) based on using a relatively small 
number of complex-model simulations as training data. 
The idea is to use a trained surrogate model to rapidly 
calculate an approximation of the response of the com¬ 
plex computational model as a substitute when drawing 


new samples. Various methods for constructing surro¬ 
gate models have been applied in reliability engineering, 
including response surfaces m, support vector machines 
[I1I23], neural networks lu, and Gaussian process mod¬ 
eling (Kriging) [^. The latter method is a particularly 
powerful one because it also provides a probabilistic as¬ 
sessment of the approximation error. It deserves further 
exploration, especially with regard to the optimal balance 
between the accuracy of the surrogate model as a func¬ 
tion of the number of training samples from the complex 
model, and the accuracy of the estimate of the rare-event 
probability as a function of the total number of samples 
from both the complex model and the surrogate model. 
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